knitr::opts_chunk$set(
fig.align = "center", message = F, warning = F, cache = T, cache.lazy = F,
class.source = "fold-hide"
)
If packages are already installed on your system, you could just use library(tibble). Otherwise you have to install it first, here you have multiple options:
install.packages("insert_your_package_name")pacman::p_load("insert_your_package_name") This loads your package and installs it if not already present.Usually, the packages you install are retrieved from CRAN: https://cran.r-project.org/ but not all packages are hosted there. Especially for bio-related packages, there is Bioconductor: https://bioconductor.org/
To install packages from Bioconductor, you can either follow the instructions on the package site on Bioconductor (e.g. https://bioconductor.org/packages/release/bioc/html/ComplexHeatmap.html ) which includes installing the BiocManager package.
Also pacman also searches Bioconductor, but only if BiocManager is installed before!
Some packages are neither on CRAN nor on Bioconductor but on github. github is a platform to versionize and distribute code. To install github packages, you can:
devtools::install_github("JinmiaoChenLab/Rphenograph")pacman::p_load_gh("JinmiaoChenLab/Rphenograph")Additionally, you might need Rtools to build R packages: https://cran.r-project.org/bin/windows/Rtools/rtools40.html If you just installed rtools you will need to restart Rstudio.
if (!require(pacman)) {
install.packages("pacman") # If not already installed
}
pacman::p_load("tidyverse", install = TRUE)
pacman::p_load("cowplot", install = TRUE)
pacman::p_load("uwot", install = TRUE)
pacman::p_load("data.table", install = TRUE)
pacman::p_load("pheatmap", install = TRUE) # Todo
pacman::p_load("needs", install = TRUE)
pacman::p_load("knitr", install = TRUE)
pacman::p_load("ggridges", install = TRUE)
pacman::p_load("grDevices", install = TRUE)
# Install BiocManager such that pacman can then find ComplexHeatmap by its own
pacman::p_load("BiocManager", install = TRUE)
pacman::p_load("ComplexHeatmap", install = TRUE)
# Github packages have to be dealt in a special way
# However you install them, you _need_ the "devtools" package
pacman::p_load("devtools", install = TRUE)
# Additionally, you might need "Rtools"
pacman::p_load_gh("JinmiaoChenLab/Rphenograph", install = TRUE)
pacman::p_load_gh("JinmiaoChenLab/cytofkit", install = TRUE)
prioritize(dplyr)
count <- dplyr::count
theme_set(theme_classic())
In this tutorial we will analyse a public data set (Georg et al. 2021) of single cell phospho-proteomics measured with Cytometry by Time of Flight (CyTOF). In this study, the authors performed CyTOF of whole blood samples from mild and severe COVID-19 patients during the acute and convalescent phase, and patients with other acute respiratory infections (Flu-like illness), as well as patients chronically infected by human immunodeficiency virus (HIV) or hepatitis B (HBV) and healthy controls. They analysed the T cell space and identified highly activated CD16+ T cells in severe COVID-19, which led the authors to hypothesise about the pathological role of these cytotoxic T cells. This hypothesis was then tested and confirmed with functional analyses, and found suitable mechanisms for their induction. In this tutorial you will learn how to perform such computational analysis either in T cells, B cells, or monocytes!
Due to time constrains, we will analyse 5% of the data set. We will start by manually pre-gating the immune cell type of interest, then we will visually explore the data by reducing the dimensionality and plotting a UMAP. After this, we will cluster the data to find discrete communities of cells, using two different algorithms for comparison. Then we will annotate/give names to these clusters by looking at the average protein expression in each community/cluster. Finally, we will calculate the abundance of each cluster per donor and identify COVID-19 or severe-specific clusters.
We start reading the data table “data_norm_sub.csv” with the function read.csv(). * The data has already been pre-processed + calibration beads excluded (gating on Ce140 Bead channel) + doublets and debris excluded (gating on DNA channels and Event_length) + dead cells excluded (gating on Live-Dead mDOTA marker) + batch corrected (normalization using BatchAdjust method, linearly scales signal distributions to similar ranges using percentiles)
# results='hide' does not print messages to the generated output file
# Standard read.csv, works for most things, but fread is faster.
# data_norm_sub <- read.csv("~/Documents/INsTRuCT_workshop/data_norm_sub5.csv")
data_norm_sub_fread <- data.table::fread("../data/data_norm_sub5.csv")
# fread is a special format, such that we have to convert it first into a usual data.frame
data_norm_sub <- data.frame(data_norm_sub_fread)
What are the columns? What are the rows?
# Show the first 5 rows and all columns --> Rows are cells, columns are features per cell
print(data_norm_sub[1:5, ])
## cellid Run FCS.Filename id
## 1 14635738 200527 200527_Blut_Panel1_CV19_BC_10_viable.fcs HC-4, r200527
## 2 1131553 210130 210130_Blut_Panel1_CV19_BC_3_viable.fcs CV-133, month 3
## 3 10833199 200619 200619_Blut_Panel1_CV19_BC_1_viable.fcs HC-IJ, r200619
## 4 2711646 210129 210129_Blut_Panel1_CV19_BC_4_viable.fcs CV29, month 6
## 5 2706623 210129 210129_Blut_Panel1_CV19_BC_4_viable.fcs CV29, month 6
## Individuals Group Severity Disease.phase max..WHO.scale sev_merge
## 1 HC-4 HC healthy acute NA healthy
## 2 CV-133 CV19 critical convalescent 7 severe/critical
## 3 HC-IJ HC healthy acute NA healthy
## 4 CV29 CV19 moderate convalescent 4 mild/moderate
## 5 CV29 CV19 moderate convalescent 4 mild/moderate
## Days.post.symptom.onset Week sev_week followup
## 1 <NA> <NA> healthy <NA>
## 2 month 3 month 3 severe/critical, month 3 fatigue
## 3 <NA> <NA> healthy <NA>
## 4 month 6 month 6 mild/moderate, month 6 no-fatigue
## 5 month 6 month 6 mild/moderate, month 6 no-fatigue
## CD45 CD3 CD19 CD15 CD8 TCRgd CD62L
## 1 289.606079 150.939178 0.9167342 0.00000 242.0820007 0.00000 0.9683754
## 2 4.722483 0.000000 0.0000000 27.07735 5.3527999 0.00000 202.9331512
## 3 26.286074 0.000000 0.5317233 0.00000 0.6332178 0.00000 115.2458649
## 4 10.766818 9.825749 1.6423904 14.03760 20.0084972 14.71366 185.6015778
## 5 306.810120 94.136932 0.0000000 0.00000 37.2872429 0.00000 10.7689219
## CD45RO CD28 CD27 CD226 ICOS PD1 Lag3
## 1 3.203919 0.0998394 1.028889 32.7283173 1.215678 2.335296 0.31030640
## 2 74.573257 0.0000000 0.000000 1.5003730 0.000000 2.362798 2.02808976
## 3 38.330097 0.0000000 0.000000 1.0449100 0.000000 0.000000 9.55651665
## 4 37.360798 3.2174973 1.021154 0.3054930 17.149399 10.483729 0.07147276
## 5 47.494534 42.9645615 78.948204 0.3698978 0.000000 0.000000 0.00000000
## TIGIT CD96 CD25 CD56 HLADR CD38 CD137
## 1 2.4361346 1.660355 0.000000 33.5058823 0.0000000 1.3536147 0.000000
## 2 8.1984510 6.666277 0.000000 0.1530982 0.0000000 21.0363445 0.000000
## 3 0.0000000 5.227810 3.529121 0.0000000 0.0000000 10.1701736 0.000000
## 4 19.8741760 11.719555 3.712134 0.0000000 0.7540856 24.1808929 3.163666
## 5 0.1908786 2.546410 8.808489 0.0000000 0.0000000 0.2533335 0.000000
## CD69 Ki67 CXCR3 CXCR5 CCR6 CRTH2 KLRB1 KLRG1
## 1 1.734452 0.00000 169.36728 0 3.7821393 4.528113 0 51.88726
## 2 6.979673 17.71251 125.74521 0 23.3952847 2.280042 0 12.58200
## 3 0.000000 26.01557 122.19785 0 1.0721407 5.629048 0 24.32183
## 4 4.214287 97.56147 82.18497 0 0.9035059 26.126760 0 19.51483
## 5 0.000000 0.00000 104.22880 0 0.6558527 8.333822 0 4.27186
## KLRF1 CD95 CD10 CD16 CD34 CD123 CD11c CD21
## 1 5.042776 16.67289 0.52912545 0.0000 2.020787001 0.000000 8.559193 0
## 2 14.849868 25.53127 53.66223526 408.7447 19.129014969 5.730159 2.268054 0
## 3 0.000000 24.80001 57.93495560 297.4183 0.005866973 0.000000 3.638553 0
## 4 6.316540 31.96333 108.64481354 320.0549 5.801159382 4.664691 8.716877 0
## 5 12.600286 15.25747 0.06902131 0.0000 0.000000000 45.156952 6.199719 0
## CD14 IgD IgM
## 1 0.0000000 1.800159 7.0955362
## 2 0.2520287 1.032365 0.5208321
## 3 5.5783634 0.000000 0.0000000
## 4 1.9058144 7.746778 6.3418970
## 5 0.0000000 0.000000 16.6123581
# Then take a look at only the column names
colnames(data_norm_sub)
## [1] "cellid" "Run"
## [3] "FCS.Filename" "id"
## [5] "Individuals" "Group"
## [7] "Severity" "Disease.phase"
## [9] "max..WHO.scale" "sev_merge"
## [11] "Days.post.symptom.onset" "Week"
## [13] "sev_week" "followup"
## [15] "CD45" "CD3"
## [17] "CD19" "CD15"
## [19] "CD8" "TCRgd"
## [21] "CD62L" "CD45RO"
## [23] "CD28" "CD27"
## [25] "CD226" "ICOS"
## [27] "PD1" "Lag3"
## [29] "TIGIT" "CD96"
## [31] "CD25" "CD56"
## [33] "HLADR" "CD38"
## [35] "CD137" "CD69"
## [37] "Ki67" "CXCR3"
## [39] "CXCR5" "CCR6"
## [41] "CRTH2" "KLRB1"
## [43] "KLRG1" "KLRF1"
## [45] "CD95" "CD10"
## [47] "CD16" "CD34"
## [49] "CD123" "CD11c"
## [51] "CD21" "CD14"
## [53] "IgD" "IgM"
Not all features of the cell are actual cell markers. We want to subset the values to only these markers. Therefore, create a vector with the name of each measured protein (what we call “the CyTOF panel”).
panel <- colnames(data_norm_sub)[15:54]
# color_severity is a named vector which we will use later.
# #xxxxxx is a html-coded color code.
color_severity <- c(
"healthy" = "#0449FF",
"FLI" = "#807F7F",
"HIV" = "#40007F",
"HBV" = "magenta",
"mild/moderate" = "#FFB651",
"severe/critical" = "#F82000"
)
Let’s look at how the values of markers are distributed:
In the CyTOF community, people commonly use the hyperbolic arcsine (asinh) transformation: \[ \rm asinh(x) = \ln(x + \sqrt{x^2+1}) \]
data_norm_sub %>%
# take only 20% of the data such that the plots are generated faster
sample_frac(0.2) %>%
# Pivot: All columns defined by "panel" will go into two columns.
# The values go into the new column "value"
# The names of the columns go into the new column "marker"
pivot_longer(names_to = "marker", values_to = "value", panel) %>%
# start plotting, x-axis is the value
ggplot(aes(x = value)) +
# We want density plots, so how the values on the x-axis are distributed
geom_density() +
# facet_wrap splits the plots according to the specified column, here "marker"
# scale="free" tells that the x-axis and y-axis are not the same for each plot
facet_wrap(~marker, scale = "free")
We notice that these are skewed distributions: Many small values, some very large values. Therefore, it makes more sense to look at these on a logarithmic scale:
# Transform all columns defined by "panel" with asinh()
data_norm_sub_trans <- data_norm_sub %>%
# Apply asinh() on all columns defined by panel
mutate_at(vars(panel), asinh)
data_norm_sub_trans %>%
# take only 20% of the data such that the plots are generated faster
sample_frac(0.2) %>%
# Pivot: All columns defined by "panel" will go into two columns.
# The values go into the new column "value"
# The names of the columns go into the new column "marker"
pivot_longer(names_to = "marker", values_to = "value", panel) %>%
# Start plotting, define "value" as x-axis
ggplot(aes(x = value)) +
# Density plots, how are the values on the x-axis distributed
geom_density() +
# Split the plots according to the "marker" column, different axis
facet_wrap(~marker, scale = "free")
Using ggplot() and geom_point(), generate a scatter plot to decide the gates.
We visualize just 10% of the data.
data_norm_sub_trans %>%
# The plots would be overwhelmed with cells being exactly 0 (CyTOF-"ownness")
dplyr::filter(CD19 > 0, CD45 > 0) %>%
# Use only 10% of the data
sample_frac(0.1) %>%
# Make a plot where the x-axis are the CD19 values, and the y-axis are the CD3 values
ggplot(aes(x = CD45, y = CD19)) +
# Each value will be a _point_ with a certain size
# and a certain transparency (alpha)
geom_point(size = 0.01, alpha = 0.1) +
# Additionally to the points, plot the density of the points in 2 dimensions
geom_density_2d() +
# Plot a rectangle with the given coordinates
# We found the coordinates by hand
# Alpha=0 because otherwise you would have a filled rectangle
geom_rect(mapping = aes(xmin = 3, xmax = 8, ymin = 4, ymax = 10), color = "black", alpha = 0)
data_norm_sub_trans %>%
dplyr::filter(
CD19 > 4 & CD19 < 10,
CD45 > 3 & CD45 < 8,
CD15 > 0, CD3 > 0
) %>%
# Plot CD15 (y-axis) vs CD3 (x-axis)
ggplot(aes(x = CD3, y = CD15)) +
# Plot points with certain size and color alpha
geom_point(size = 0.01, alpha = 0.1) +
# Make 2d-density plots
geom_density_2d() +
# Plot a rectangle
# Alpha because geom_rect makes a filled rectangle otherwise
geom_rect(mapping = aes(xmin = 0, xmax = 3.6, ymin = 0, ymax = 3), color = "black", alpha = 0)
We first add a column to the data table where each cell gets the classification Bcell = {TRUE, FALSE} according to your gating strategy. Then let’s see if the percentage of B cells make sense and how it looks per disease group (variable sev_merge).
data_norm_sub_trans <- data_norm_sub_trans %>%
# ifelse(logical_value, if TRUE, if FALSE)
# ifelse returns the value inside if TRUE when the logical
# value is TRUE, otherwise the other value.
# Draw proper gates in the transformed values to get the (CD45+CD3+CD15-CD19-) cells
mutate(
Bcells =
ifelse(
CD19 > 4 &
CD19 < 10 &
CD45 > 3 &
CD45 < 8 &
CD3 < 3.6 &
CD15 < 3, TRUE, FALSE
)
)
data_norm_sub_trans <- data_norm_sub_trans %>%
# Factors are R-internal special vectors having "levels".
# All values must be a value of "levels". Internally, a factor is a numeric value pointing
# to one of the levels.
# ggplot is smart enough to handle factors properly (better than only vectors if needed)!
mutate(
sev_merge = factor(
sev_merge,
levels = c("healthy", "FLI", "HIV", "HBV", "mild/moderate", "severe/critical")
)
)
data_norm_sub_trans %>%
# count for each combination of
# id: Sample id
# Bcells: If it is a Tcell or not
# sev_merge: Which severity it is (sev_merge = mild+moderate and severe+critical merged)
# Disease.phase: acute or covalescent
# How many cells there are
# Count saves that in a new column "n"
count(id, Bcells, sev_merge, Disease.phase) %>%
# tidyverse functions can leverage "groups" of data, but you have to specify first
# what the group is defined on.
# Here we want to group the cells based on their (sample-) id
group_by(id) %>%
# We grouped the table by id (the sample)
# Now we define perc as "n" divided by the sum of all "n"s times 100
# This is applied _inside each group_
# After each group has only two rows (Bcells = TRUE or FALSE)
# This is effectively the percentage of Bcells in each sample
mutate(perc = n / sum(n) * 100) %>%
# We do not need the grouping anymore, therefore remove it
ungroup() %>%
# Select only Bcells (As "Bcells" is a True/False column)
filter(Bcells) %>%
# plot the just calculated percentage on the y-axis
# And the severity on the x-axis
# "fill" fills plots, usually used for coloring.
ggplot(aes(y = perc, x = sev_merge, fill = sev_merge)) +
# Make a boxplot per fill-value. That happens automatically.
# Alpha sets the color-density
geom_boxplot(position = position_dodge(1), alpha = 0.7) +
# Overlay dotplot to the barplots
geom_dotplot(
binaxis = "y", stackdir = "center",
position = position_dodge(1), alpha = 0.7
) +
# facet_grid splits the plots according to multiple values,
# you could even use multiple splitting-values
# with "free_x" we allow the x-axis to vary between plots, but the y-axis is fixed.
facet_grid(~Disease.phase, space = "free_x", scale = "free_x") +
# Previously we defined the coloring for the severity stages, here we apply it.
scale_fill_manual(values = color_severity) +
# Set y-axis label
ylab("Percentage of B cells (%)") +
# Set x-axis label to empty ("")
xlab("") +
# Remove the x-axis text and ticks.
theme(
axis.text.x = element_blank(),
axis.ticks.x = element_blank()
)
https://pair-code.github.io/understanding-umap/
We now compute UMAP for B cells across all samples (acute and convalescent), and using all markers, except CD45, CD3, CD19, CD15, CD8, TCRgd and CD14.
We define a vector “sel_markers_Bcells” with the selected markers to be used for the calculation of the UMAP.
# panel %in% c("text1", "text2", ...)
# Returns a vector which is TRUE if its corresponding value in "panel"
# is in the vector on the right and FALSE if it is not.
# ! panel %in% c("text1", "text2", ...)
# inverts this logical vector, so FALSE if value is contained in the right, TRUE if not.
# panel[!panel %in% c("CD45", "CD3", "CD19", "CD15", "CD8", "TCRgd", "CD14")]
# selects the TRUE values in "panel" according to the just specified vector.
sel_markers_Bcells <- panel[!panel %in% c("CD45", "CD3", "CD19", "CD15", "CD8", "TCRgd", "CD14")]
Before calculating the UMAP, is important we transform our data (asinh) and apply z-score standardization (function scale). Why?
data_Bcells <- data_norm_sub_trans %>%
# Filter all cells which are Bcells
filter(Bcells) %>%
# Then apply scale() on all columns defined by sel_markers_Bcells
mutate_at(vars(sel_markers_Bcells), scale)
UMAP_Bcells <- data_Bcells %>%
# Select only the sel_markers_Bcells columns
select(sel_markers_Bcells) %>%
# Calculate UMAP with defined hyperparameters
uwot::umap(
# The size of local neighborhood
n_neighbors = 30,
# The effective scale of embedded points. In combination with min_dist,
# this determines how clustered/clumped the embedded points are.
spread = 1,
# The effective minimum distance between embedded points
min_dist = 0.5,
# Type of distance metric to use to find nearest neighbors
metric = "euclidean",
# If TRUE, log details to the console.
verbose = TRUE,
# Setting fast_sgd to TRUE will speed up the stochastic optimization phase, but give a
# potentially less accurate embedding, and which will not be exactly reproducible
# even with a fixed seed.
fast_sgd = TRUE
)
data_Bcells$UMAP1 <- NA
data_Bcells$UMAP2 <- NA
data_Bcells$UMAP1 <- UMAP_Bcells[, 1]
data_Bcells$UMAP2 <- UMAP_Bcells[, 2]
We now plot each UMAP, coloured by severity, disease phase, and intensity of a selected marker. Recommendation: Subsample Cells for visualization, using sample_n(), or sample_frac.
Do you observe specific areas where CV19 samples accumulate? What does this mean?
data_Bcells %>%
# Plot the two just calculated UMAPs, color according to sev_merge(severity)
ggplot(aes(x = UMAP1, y = UMAP2, color = sev_merge)) +
# Plot points
geom_point(alpha = 0.5, size = 0.5) +
# Bigger and not transparent dots for the legend
guides(colour = guide_legend(ncol = 1, override.aes = list(size = 3, alpha = 1))) +
# Use our previously defined colors
scale_color_manual(values = color_severity, name = "") +
# Set a proper title!
# Always set proper titles. The idea is that you should be able to send a plot to you boss and
# he should understand what happens by just looking at the plot.
ggtitle("B cells") +
# legend.position="none" removes the legend
# plot.title = element_text(hjust...) adjusts the horizontal adjustment (placing) of the title.
theme(legend.position = "none", plot.title = element_text(hjust = 0.5))
Do you observe specific areas where convalescent samples accumulate? What does this mean?
data_Bcells %>%
# Plot the two just calculated UMAPs, color according to Disease.phase
ggplot(aes(x = UMAP1, y = UMAP2, color = Disease.phase)) +
# Plot points
geom_point(alpha = 0.5, size = 0.5) +
# Bigger and not transparent dots for the legend
guides(colour = guide_legend(ncol = 1, override.aes = list(size = 3, alpha = 1))) +
# Use our previously defined colors
scale_color_manual(values = c("acute" = "red", "convalescent" = "black"), name = "") +
# plot.title = element_text(hjust...) adjusts the horizontal adjustment (placing) of the title.
theme(plot.title = element_text(hjust = 0.5))
p1 <- data_Bcells %>%
# Plot the two just calculated UMAPs, color according to CD34 values
ggplot(aes(x = UMAP1, y = UMAP2, color = CD34)) +
# Make points
geom_point(alpha = 0.5, size = 0.5) +
# plot.title = element_text(hjust...) adjusts the horizontal adjustment (placing) of the title.
theme(plot.title = element_text(hjust = 0.5)) +
# scale_color_gradient2 sets the color-scale for the aes "color"
# Where you can define which colors low, mid and high values should have
# Then it makes two color-gradients:
# low -> mid
# mid -> low
# Mixing the colors appropriately according to the value of the given column in aes()
scale_color_gradient2(low = "blue", mid = "grey", high = "red", midpoint = 0)
p2 <- data_Bcells %>%
# Plot the two just calculated UMAPs, color according to IgM values
ggplot(aes(x = UMAP1, y = UMAP2, color = IgM)) +
# Make points
geom_point(alpha = 0.5, size = 0.5) +
# plot.title = element_text(hjust...) adjusts the horizontal adjustment (placing) of the title.
theme(plot.title = element_text(hjust = 0.5)) +
# scale_color_gradient2 sets the color-scale for the aes "color"
# Where you can define which colors low, mid and high values should have
# Then it makes two color-gradients:
# low -> mid
# mid -> low
# Mixing the colors appropriately according to the value of the given column in aes()
scale_color_gradient2(low = "blue", mid = "grey", high = "red", midpoint = 0)
p3 <- data_Bcells %>%
# Plot the two just calculated UMAPs, color according to CD38 values
ggplot(aes(x = UMAP1, y = UMAP2, color = CD38)) +
# Make points
geom_point(alpha = 0.5, size = 0.5) +
# plot.title = element_text(hjust...) adjusts the horizontal adjustment (placing) of the title.
theme(plot.title = element_text(hjust = 0.5)) +
# scale_color_gradient2 sets the color-scale for the aes "color"
# Where you can define which colors low, mid and high values should have
# Then it makes two color-gradients:
# low -> mid
# mid -> low
# Mixing the colors appropriately according to the value of the given column in aes()
scale_color_gradient2(low = "blue", mid = "grey", high = "red", midpoint = 0)
plot_grid(p1, p2, p3, nrow = 1)
We now perform unsupervised clustering analysis on samples from control, FLI, HIV, HBV, and acute COVID-19 using the selected markers. As done for the UMAP calculation, before clustering is important we transform and z-score normalize our data. Let’s look at the distribution of the markers used for clustering:
data_Bcells %>%
# Keep cells from "acute" disease phase
filter(Disease.phase == "acute") %>%
# Keep only the columns defined by sel_markers_Bcells
select(sel_markers_Bcells) %>%
# Make the table from wide to long format
# The previous column names go into the new column "marker"
# The previous column's _values_ go into the new column "marker"
# And this should be applied to all (everything()) previous columns
pivot_longer(names_to = "marker", values_to = "value", everything()) %>%
# "value" on the x-axis
ggplot(aes(x = value)) +
# Make density plots
geom_density() +
# Split the plots according to the "marker" value, scale them freely
facet_wrap(~marker, scale = "free")
What’s the main difference between RPhenograph and FlowSOM?
IMPORTANT! In RPhenograph we define the number of nearest neighbours.
# Sys.time() returns the current time on your PC
start_time <- Sys.time()
clust_bcells_rpheno <- data_Bcells %>%
# Keep cells from "acute" disease phase
filter(Disease.phase == "acute") %>%
# Keep only the columns defined by sel_markers_Bcells
select(sel_markers_Bcells) %>%
# Start clustering with Rphenograph
# k: number of nearest neighbours (default:30)
Rphenograph(k = 30)
# Sys.time() returns the current time on your PC
end_time <- Sys.time()
# with difftime you can calculate the difference between two times received by Sys.time()
# You can also supply the unit you want to see (here "minutes")
#
# as.numeric() then makes a PC-usable number from that
# and round(..., 2) rounds to the second place after comma
time_elapsed <- round(as.numeric(difftime(end_time, start_time, units = "mins")), 2)
# paste() combines multiple strings together in R
# "\n" is a special string to make a new line when
# cat() is used to actually print to console
print(paste("\n", time_elapsed, "minutes passed"))
## [1] "\n 0.42 minutes passed"
After running the clustering algorithm, we add a column “Rpheno” in the data table with the cluster label for each cell:
clust_ids <- data_Bcells %>%
# Keep cells from "acute" disease phase
filter(Disease.phase == "acute") %>%
# Extract the column "cellid" from the dataframe as a usual vector
# This column uniquely identifies each cell
pull(cellid)
# Make a new tibble (table) with two columns:
# "cellid": The (unique!) cell ids
# "Rpheno": The cluster assignment for each of the cells
clust_bcells_rpheno <- tibble(
cellid = clust_ids,
Rpheno = as.character(membership(clust_bcells_rpheno))
)
# left_join here joins automatically by the column "cellid"
# So the column "cellid" is taken as identifier and the data_Bcells is extended by the
# column "Rpheno", so the cluster assignment by Rpheno
data_Bcells <- data_Bcells %>% left_join(clust_bcells_rpheno)
data_Bcells <- data_Bcells %>%
# We will use Rpheno later in plots and want to have sorted cluster numbers.
# ggplot can only do that when the column is a factor
# If the column is a factor, the ordering of values is according to its levels
mutate(Rpheno = factor(Rpheno, levels = str_sort(unique(Rpheno), numeric = TRUE)))
IMPORTANT! In FlowSOM we define the number of clusters.
# Sys.time() returns the current time on your PC
start_time <- Sys.time()
clust_bcells_flowsom <- data_Bcells %>%
# Keep cells from "acute" disease phase
filter(Disease.phase == "acute") %>%
# Keep only the columns defined by sel_markers_Bcells
select(sel_markers_Bcells) %>%
# Start clustering with FlowSOM (from the cytofkit package)
cytofkit::cytof_cluster(
# %>% usually "pipes" the data into the FIRST argument of the called function
# but here we want to leave the first argument ("ydata") free, so we have to define
# it explicitely.
# To then "access" the piped data you can use the dot (.)
xdata = .,
# Specify the used method, here FlowSOM
method = "FlowSOM",
# Number of clusters for meta clustering in FlowSOM.
FlowSOM_k = 11
)
# Sys.time() returns the current time on your PC
end_time <- Sys.time()
# with difftime you can calculate the difference between two times received by Sys.time()
# You can also supply the unit you want to see (here "minutes")
#
# as.numeric() then makes a PC-usable number from that
# and round(..., 2) rounds to the second place after comma
time_elapsed <- round(as.numeric(difftime(end_time, start_time, units = "mins")), 2)
# paste() combines multiple strings together in R
# "\n" is a special string to make a new line when
# cat() is used to actually print to console
print(paste("\n", time_elapsed, "minutes passed"))
## [1] "\n 0.04 minutes passed"
After running the clustering algorithm, we add a column “flowsom” in the data table with the cluster label for each cell:
clust_ids <- data_Bcells %>%
# Keep cells from "acute" disease phase
filter(Disease.phase == "acute") %>%
# Extract the column "cellid" from the dataframe as a usual vector
# This column uniquely identifies each cell
pull(cellid)
# Make a new tibble (table) with two columns:
# "cellid": The (unique!) cell ids
# "flowsom": The cluster assignment for each of the cells
clust_bcells_flowsom <- tibble(
cellid = clust_ids,
flowsom = as.character(clust_bcells_flowsom)
)
# left_join here joins automatically by the column "cellid"
# So the column "cellid" is taken as identifier and the transformed_scaled_data is extended by the
# column "flowsom", so the cluster assignment by flowsom
data_Bcells <- data_Bcells %>% left_join(clust_bcells_flowsom)
data_Bcells <- data_Bcells %>%
# We will use flowsom later in plots and want to have sorted cluster numbers.
# ggplot can only do that when the column is a factor
# If the column is a factor, the ordering of values is according to its levels
mutate(flowsom = factor(flowsom, levels = str_sort(unique(flowsom), numeric = TRUE)))
We now visualize the number of cells in each cluster:
data_Bcells %>%
# Keep cells from "acute" disease phase
filter(Disease.phase == "acute") %>%
# Count the number of rows (cells) per value in the Rpheno (cluster-id) column
count(Rpheno) %>%
# Plot
# - Rpheno column (cluster-id) on the x-axis
# - Number of cells per Rpheno on y-axis
# - Use the number of cells as label
ggplot(aes(x = Rpheno, y = n, label = n)) +
# Make a bar chart
geom_col(position = "dodge") +
# Add labels on each bar
geom_label()
data_Bcells %>%
# Keep cells from "acute" disease phase
filter(Disease.phase == "acute") %>%
# Count the number of rows (cells) per value in the flowsom (cluster-id) column
count(flowsom) %>%
# Plot
# - flowsom column (cluster-id) on the x-axis
# - Number of cells per flowsom on y-axis
# - Use the number of cells as label
ggplot(aes(x = flowsom, y = n, label = n)) +
# Make a bar chart
geom_col(position = "dodge") +
# Add labels on each bar
geom_label()
And then plot the UMAP, this time coloured by cluster:
data_Bcells %>%
# Keep cells from "acute" disease phase
filter(Disease.phase == "acute") %>%
# Plot the two just calculated UMAPs, color according to Rphenograph clusters
ggplot(aes(x = UMAP1, y = UMAP2, color = Rpheno)) +
# Plot points with transparency and size defined
geom_point(alpha = 0.5, size = 1) +
# Bigger and not transparent dots for the legend
guides(colour = guide_legend(ncol = 1, override.aes = list(size = 3, alpha = 1))) +
# Setting proper title
ggtitle("B cells") +
theme(plot.title = element_text(hjust = 0.5))
data_Bcells %>%
# Keep cells from "acute" disease phase
filter(Disease.phase == "acute") %>%
# Plot the two just calculated UMAPs, color according to flowsom clusters
ggplot(aes(x = UMAP1, y = UMAP2, color = flowsom)) +
# Plot points with transparency and size defined
geom_point(alpha = 0.5, size = 1) +
# Bigger and not transparent dots for the legend
guides(colour = guide_legend(ncol = 1, override.aes = list(size = 3, alpha = 1))) +
# Setting proper title
ggtitle("B cells") +
# Center the plot title
theme(plot.title = element_text(hjust = 0.5))
Let’s look at the marker distribution per cluster:
data_Bcells %>%
# Keep cells from "acute" disease phase
filter(Disease.phase == "acute") %>%
# Keep only the columns defined by sel_markers_Bcells _and_ Rpheno
select(sel_markers_Bcells, Rpheno) %>%
# Make the table from wide to long format
# The previous column names go into the new column "marker"
# The previous column's _values_ go into the new column "marker"
# And this should be applied to the columns defined by sel_markers_Bcells
pivot_longer(names_to = "marker", values_to = "value", sel_markers_Bcells) %>%
# x-axis: Marker values
# y-axis: Cluster ids
# fill-color: Cluster ids
ggplot(aes(x = value, fill = Rpheno, y = Rpheno)) +
# geom_density_ridges plots a density plot for each value defined by y
geom_density_ridges() +
# Make the same kind of multi-density-plot for each marker!
facet_wrap(~marker, scale = "free")
data_Bcells %>%
# Keep cells from "acute" disease phase
filter(Disease.phase == "acute") %>%
# Keep only the columns defined by sel_markers_Bcells _and_ flowsom
select(sel_markers_Bcells, flowsom) %>%
# Make the table from wide to long format
# The previous column names go into the new column "marker"
# The previous column's _values_ go into the new column "marker"
# And this should be applied to the columns defined by sel_markers_Bcells
pivot_longer(names_to = "marker", values_to = "value", sel_markers_Bcells) %>%
# x-axis: Marker values
# y-axis: Cluster ids
# fill-color: Cluster ids
ggplot(aes(x = value, fill = flowsom, y = flowsom)) +
# geom_density_ridges plots a density plot for each value defined by y
geom_density_ridges() +
# Make the same kind of multi-density-plot for each marker!
facet_wrap(~marker, scale = "free")
We now want to actually understand what are these clusters we found. For this, we can visualize the average expression of each marker in each cluster with a heatmap.
data_Bcells %>%
# Keep cells from "acute" disease phase
filter(Disease.phase == "acute") %>%
# Keep only the columns defined by sel_markers_Bcells _and_ Rpheno
select(sel_markers_Bcells, Rpheno) %>%
# group by Rpheno-clusters
group_by(Rpheno) %>%
# Summarise_at "summarises" all rows (per group!) using the functions defined in list()
# "~mean(...)" is a so-called lambda function (anonymous)
# In essence: Calculate the mean _per marker_ inside each cluster (because group_by)
summarise_at(vars(sel_markers_Bcells), list(~ mean(., na.rm = TRUE))) %>%
# Make the table from wide to long format
# The previous column names go into the new column "marker"
# The previous column's _values_ go into the new column "marker"
# And this should be applied to all columns EXCEPT Rpheno (the "-" tells that)
pivot_longer(names_to = "markers", values_to = "avg_zscore", -Rpheno) %>%
# Modify the column "markers" into being a factor with a specified order of levels
# (the reverse order from sel_markers_Bcells)
mutate(markers = factor(markers, levels = sel_markers_Bcells)) %>%
# Start plot:
# - x-axis: Cluster
# - y-axis: The marker
# - fill color: The just calculated average score
ggplot(aes(x = Rpheno, y = markers, fill = avg_zscore)) +
# Make a heatmap
geom_tile() +
# Replace the default color scale
scale_fill_gradient2(low = "blue", mid = "white", high = "red", limits = c(-3, 7.5))
data_Bcells %>%
# Keep cells from "acute" disease phase
filter(Disease.phase == "acute") %>%
# Keep only the columns defined by sel_markers_Bcells _and_ flowsom
select(sel_markers_Bcells, flowsom) %>%
# group by flowsom-clusters
group_by(flowsom) %>%
# Summarise_at "summarises" all rows (per group!) using the functions defined in list()
# "~mean(...)" is a so-called lambda function (anonymous)
# In essence: Calculate the mean _per marker_ inside each cluster (because group_by)
summarise_at(vars(sel_markers_Bcells), list(~ mean(., na.rm = TRUE))) %>%
# Make the table from wide to long format
# The previous column names go into the new column "marker"
# The previous column's _values_ go into the new column "marker"
# And this should be applied to all columns EXCEPT flowsom (the "-" tells that)
pivot_longer(names_to = "markers", values_to = "avg_zscore", -flowsom) %>%
# Modify the column "markers" into being a factor with a specified order of levels
# (the reverse order from sel_markers_Bcells)
mutate(markers = factor(markers, levels = sel_markers_Bcells)) %>%
# Start plot:
# - x-axis: Cluster
# - y-axis: The marker
# - fill color: The just calculated average score
ggplot(aes(x = flowsom, y = markers, fill = avg_zscore)) +
# Make a heatmap
geom_tile() +
# Replace the default color scale
scale_fill_gradient2(low = "blue", mid = "white", high = "red", limits = c(-3, 7.5))
With the library “pheatmap” we can additionally group the clusters (dendogram) and group the markers by category (annotation).
data_heatmap <- data_Bcells %>%
# Keep cells from "acute" disease phase
filter(Disease.phase == "acute") %>%
# Keep only the columns defined by sel_markers_Bcells _and_ Rpheno
select(sel_markers_Bcells, Rpheno) %>%
# group by Rpheno-clusters
group_by(Rpheno) %>%
# Summarise_at "summarises" all rows (per group!) using the functions defined in list()
# "~mean(...)" is a so-called lambda function (anonymous)
# In essence: Calculate the mean _per marker_ inside each cluster (because group_by)
summarise_at(vars(sel_markers_Bcells), list(~ mean(., na.rm = TRUE))) %>%
# tibbles usually do not have row-names, but some (usually classical) functions
# make use of the rownames
# Therefore, let's make the Rpheno-column the rownames!
column_to_rownames("Rpheno")
bcell_markers <- c(
'IgD','IgM','CD10','CD21','CD27','CD38','CXCR3','CXCR5','CCR6','CRTH2',
'HLADR','CD69','Ki67','CD11c','CD25','CD95','CD137','CD226','ICOS','PD1',
'Lag3','TIGIT','CD96','CD123','KLRG1','KLRF1','KLRB1','CD16','CD28','CD62L',
'CD45RO','CD56','CD34')
# Create a data frame for the marker categories
annotation_rows <- data.frame(
marker_category = c(
rep("Differentiation",6),
rep("Chemokine receptor",4),
rep("Activation",3),
rep("Miscellaneous",20))
)
# The sel_markers_Tcells should be the rownames of the annotations
rownames(annotation_rows) <- bcell_markers
data_heatmap[, bcell_markers] %>%
# t() transposes data_heatmap
# apart from that it automatically converts the tibble into a matrix
t() %>%
pheatmap(
# Do not cluster the rows when plotting
cluster_rows = FALSE,
# Do cluster the columns when plotting
cluster_cols = TRUE,
# Add the annotation per row
annotation_row = annotation_rows,
# Do not show the annotation_row column name
annotation_names_row = FALSE,
# vector of row indices that show where to put gaps into heatmap. Used only if the rows are not clustered. See cutree_row to see how to introduce gaps to clustered rows.
gaps_row = c(6, 10, 13),
# a sequence of numbers that covers the range of values in mat and
# is one element longer than color vector.
# Used for mapping values to colors.
breaks = seq(-4, 4, length.out = 100),
# For every value in "breaks", generate a corresponding color value with colorRampPalette
# This then goes
# blue (minimum) -> white (midpoint) -> red (maximum)
color = colorRampPalette(c("blue", "white", "red"))(100)
)
data_heatmap <- data_Bcells %>%
# Keep cells from "acute" disease phase
filter(Disease.phase == "acute") %>%
# Keep only the columns defined by sel_markers_Bcells _and_ flowsom
select(sel_markers_Bcells, flowsom) %>%
# group by flowsom-clusters
group_by(flowsom) %>%
# Summarise_at "summarises" all rows (per group!) using the functions defined in list()
# "~mean(...)" is a so-called lambda function (anonymous)
# In essence: Calculate the mean _per marker_ inside each cluster (because group_by)
summarise_at(vars(sel_markers_Bcells), list(~ mean(., na.rm = TRUE))) %>%
# tibbles usually do not have row-names, but some (usually classical) functions
# make use of the rownames
# Therefore, let's make the flowsom-column the rownames!
column_to_rownames("flowsom")
bcell_markers <- c(
'IgD','IgM','CD10','CD21','CD27','CD38','CXCR3','CXCR5','CCR6','CRTH2',
'HLADR','CD69','Ki67','CD11c','CD25','CD95','CD137','CD226','ICOS','PD1',
'Lag3','TIGIT','CD96','CD123','KLRG1','KLRF1','KLRB1','CD16','CD28','CD62L',
'CD45RO','CD56','CD34')
# Create a data frame for the marker categories
annotation_rows <- data.frame(
marker_category = c(
rep("Differentiation",6),
rep("Chemokine receptor",4),
rep("Activation",3),
rep("Miscellaneous",20))
)
# The sel_markers_Tcells should be the rownames of the annotations
rownames(annotation_rows) <- bcell_markers
data_heatmap[, bcell_markers] %>%
# t() transposes data_heatmap
# apart from that it automatically converts the tibble into a matrix
t() %>%
pheatmap(
# Do not cluster the rows when plotting
cluster_rows = FALSE,
# Do cluster the columns when plotting
cluster_cols = TRUE,
# Add the annotation per row
annotation_row = annotation_rows,
# Do not show the annotation_row column name
annotation_names_row = FALSE,
# vector of row indices that show where to put gaps into heatmap. Used only if the rows are not clustered. See cutree_row to see how to introduce gaps to clustered rows.
gaps_row = c(6, 10, 13),
# a sequence of numbers that covers the range of values in mat and
# is one element longer than color vector.
# Used for mapping values to colors.
breaks = seq(-4, 4, length.out = 100),
# For every value in "breaks", generate a corresponding color value with colorRampPalette
# This then goes
# blue (minimum) -> white (midpoint) -> red (maximum)
color = colorRampPalette(c("blue", "white", "red"))(100)
)
How much percentage of cells from RPhenograph cluster X where classified in FlowSOM cluster Y?
data_heatmap <- data_Bcells %>%
# Keep cells from "acute" disease phase
filter(Disease.phase == "acute") %>%
# count for each combination of
# Rpheno: Cluster from Rphenograph
# flowsom: Cluster from flowsom
# How many cells there are
# Count saves that in a new column "n"
count(Rpheno, flowsom) %>%
# When using the function count(), if a cluster is absent in a donor, it will not be counted as zero.
# So we complete the count table by filling the missing clusters with a 0.
tidyr::complete(Rpheno, flowsom, fill = list(n = 0)) %>%
# group by Rpheno-clusters
group_by(Rpheno) %>%
# due to the grouping, sum(n) gets the number of cells in each Rpheno-cluster
mutate(total_Rpheno = sum(n)) %>%
# Remove the grouping
ungroup() %>%
# Get the percentage of cells (per row) in total_Rpheno
# After total_Rpheno was calculated per Rpheno-cluster this results in the percentage
# of cells of each flowSOM cluster being in the respective Rpheno-cluster
mutate(perc = n / total_Rpheno * 100) %>%
# Select only the columns Rpheno, flowsom, perc
select(Rpheno, flowsom, perc) %>%
# Now generate a matrix-like visualization by moving the Rpheno-column into separate columns
# based on the Rpheno-value
pivot_wider(names_from = Rpheno, values_from = "perc") %>%
# Move the flowsom column to the rownames.
column_to_rownames("flowsom")
pheatmap(
data_heatmap,
# Do cluster the rows when plotting
cluster_rows = TRUE,
# Do cluster the columns when plotting
cluster_cols = TRUE,
# Create custom labels for the rows
labels_row = paste0(rownames(data_heatmap), "_flowsom"),
# Create custom labels for the columns
labels_col = paste0(colnames(data_heatmap), "_rpheno")
)
Finally, we can calculate the abundance of each cluster in each sample and check if there are COVID-specific or severity-specific groups of cells.
ACHTUNG! In this data set, donors where sampled multiple times during the disease course. We establish an arbitrary rule of choosing the first sample per donor (usually during the first week post-symptom onset) if multiple samples are available.
selected_ids <- data_Bcells %>%
# Keep cells from "acute" disease phase
filter(Disease.phase == "acute") %>%
# count for each combination of
# Individuals: Patient id (there are potentially multiple samples per patient)
# id: Sample id
# Group: "HC" "CV19" "HBV" "HIV" "FLI"
# sev_merge: Which severity it is (sev_merge = mild+moderate and severe+critical merged)
# Days.post.symptom.onset: Time of measurement after symptom onset
# How many cells there are
# Count saves that in a new column "n"
count(Individuals, id, Group, sev_merge, Days.post.symptom.onset) %>%
# Remove the "n" column
select(-n) %>%
# Group by Individual
group_by(Individuals) %>%
# Inside each Individual, create a new column "n_samples"
# which always starts from 1 and goes up to the number of samples of that individual (which is the number of rows in that group)
mutate(n_samples = 1:n()) %>%
# Select the id where the Days.post.symptom.onset is the minimum
mutate(select_id = ifelse(n_samples == 1, TRUE,
ifelse(min(as.numeric(Days.post.symptom.onset)) == as.numeric(Days.post.symptom.onset),
TRUE,
FALSE
)
)) %>%
# Get only the rows where select_id column is TRUE
filter(select_id) %>%
# Get the ids
pull(id)
We can visualize the abundance of each cluster per donor as boxplots per severity group.
data_Bcells %>%
# Keep all samples (id) which are in the selected_ids
filter(id %in% selected_ids) %>%
# count for each combination of
# id: Sample id
# sev_merge: Which severity it is (sev_merge = mild+moderate and severe+critical merged)
# Rpheno: Cluster from Rphenograph
# How many cells there are
# Count saves that in a new column "n"
count(id, sev_merge, Rpheno) %>%
# Group by sample ID
group_by(id) %>%
# Get the percentage of rows per group (so... per sample)
mutate(perc = n / sum(n) * 100) %>%
# Remove grouping
ungroup() %>%
# When using the function count(), if a cluster is absent in a donor, it will not be counted as zero.
# So we complete the count table by filling the missing clusters with a 0.
tidyr::complete(id, Rpheno, fill = list(n = 0, perc = 0)) %>%
# Group per sample
group_by(id) %>%
# after we grouped per sample, each sample has its own sev_merge value for all clusters!
mutate(sev_merge = unique(sev_merge[!is.na(sev_merge)])) %>%
# Remove grouping
ungroup() %>%
# Plot:
# x-axis: sev_merge
# y-axis: percentage of cells in the cluster
# fill color: sev_merge
ggplot(aes(x = sev_merge, y = perc, fill = sev_merge)) +
# Make boxplots
geom_boxplot() +
# Add the values as points
geom_jitter() +
# Split the previous plot into each separate cluster
facet_wrap(~Rpheno, scale = "free") +
# Previously we defined the coloring for the severity stages, here we apply it.
scale_fill_manual(values = color_severity) +
# Remove the x-axis text
theme(axis.text.x = element_blank()) +
# Set a title
ggtitle("RPhenograph cluster abundance")
data_Bcells %>%
# Keep all samples (id) which are in the selected_ids
filter(id %in% selected_ids) %>%
# count for each combination of
# id: Sample id
# sev_merge: Which severity it is (sev_merge = mild+moderate and severe+critical merged)
# flowsom: Cluster from flowsom
# How many cells there are
# Count saves that in a new column "n"
count(id, sev_merge, flowsom) %>%
# Group by sample ID
group_by(id) %>%
# Get the percentage of rows per group (so... per sample)
mutate(perc = n / sum(n) * 100) %>%
# Remove grouping
ungroup() %>%
# When using the function count(), if a cluster is absent in a donor, it will not be counted as zero.
# So we complete the count table by filling the missing clusters with a 0.
tidyr::complete(id, flowsom, fill = list(n = 0, perc = 0)) %>%
# Group per sample
group_by(id) %>%
# after we grouped per sample, each sample has its own sev_merge value for all clusters!
mutate(sev_merge = unique(sev_merge[!is.na(sev_merge)])) %>%
# Remove grouping
ungroup() %>%
# Plot:
# x-axis: sev_merge
# y-axis: percentage of cells in the cluster
# fill color: sev_merge
ggplot(aes(x = sev_merge, y = perc, fill = sev_merge)) +
# Make boxplots
geom_boxplot() +
# Add the values as points
geom_jitter() +
# Split the previous plot into each separate cluster
facet_wrap(~flowsom, scale = "free") +
# Previously we defined the coloring for the severity stages, here we apply it.
scale_fill_manual(values = color_severity) +
# Remove the x-axis text
theme(axis.text.x = element_blank()) +
# Set a title
ggtitle("FlowSom cluster abundance")
Make a presentation/talk (up to 20 minutes long, 10 minutes discussion) about the analysis pipeline that you familiarized yourself with. Mention what cell type you looked at, and present the results with your interpretation given the context of the experimental design.
Doesn’t have to be a proper PowerPoint, you can just open the R Markdown and go through it.
Following questions should be touched upon:
Pre-gating:
UMAP:
Unsupervised Clustering:
What are the differences between algorithms?
Which one gave better results?
What are the situations where you would prefer using the other one?
Cluster annotation
Interpretation of results in the context of acute COVID-19 using heatmaps and cluster abundances.
Future plans:
print("Done")
## [1] "Done"
print(session_info())
## - Session info ---------------------------------------------------------------
## setting value
## version R version 4.1.3 (2022-03-10)
## os Windows 10 x64 (build 22000)
## system x86_64, mingw32
## ui RTerm
## language (EN)
## collate German_Germany.1252
## ctype German_Germany.1252
## tz Europe/Berlin
## date 2022-04-26
## pandoc 2.14.0.3 @ D:/Programme/RStudio/bin/pandoc/ (via rmarkdown)
##
## - Packages -------------------------------------------------------------------
## ! package * version date (UTC) lib source
## abind 1.4-5 2016-07-21 [1] CRAN (R 4.1.1)
## assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.1.3)
## aws.s3 0.3.21 2020-04-07 [1] CRAN (R 4.1.3)
## aws.signature 0.6.0 2020-06-01 [1] CRAN (R 4.1.3)
## backports 1.4.1 2021-12-13 [1] CRAN (R 4.1.2)
## base64enc 0.1-3 2015-07-28 [1] CRAN (R 4.1.1)
## Biobase 2.54.0 2021-10-26 [1] Bioconductor
## BiocGenerics 0.40.0 2021-10-26 [1] Bioconductor
## BiocManager * 1.30.16 2021-06-15 [1] CRAN (R 4.1.3)
## bitops 1.0-7 2021-04-24 [1] CRAN (R 4.1.1)
## boot 1.3-28 2021-05-03 [1] CRAN (R 4.1.3)
## brio 1.1.3 2021-11-30 [1] CRAN (R 4.1.3)
## broom 0.8.0 2022-04-13 [1] CRAN (R 4.1.3)
## bslib 0.3.1 2021-10-06 [1] CRAN (R 4.1.3)
## cachem 1.0.6 2021-08-19 [1] CRAN (R 4.1.3)
## callr 3.7.0 2021-04-20 [1] CRAN (R 4.1.3)
## car 3.0-12 2021-11-06 [1] CRAN (R 4.1.3)
## carData 3.0-5 2022-01-06 [1] CRAN (R 4.1.3)
## caTools 1.18.2 2021-03-28 [1] CRAN (R 4.1.3)
## cellranger 1.1.0 2016-07-27 [1] CRAN (R 4.1.3)
## circlize 0.4.14 2022-02-11 [1] CRAN (R 4.1.3)
## class 7.3-20 2022-01-16 [1] CRAN (R 4.1.3)
## cli 3.2.0 2022-02-14 [1] CRAN (R 4.1.3)
## clue 0.3-60 2021-10-11 [1] CRAN (R 4.1.3)
## cluster 2.1.2 2021-04-17 [1] CRAN (R 4.1.3)
## codetools 0.2-18 2020-11-04 [1] CRAN (R 4.1.3)
## colorRamps 2.3 2012-10-29 [1] CRAN (R 4.1.1)
## colorspace 2.0-3 2022-02-21 [1] CRAN (R 4.1.3)
## colourpicker 1.1.1 2021-10-04 [1] CRAN (R 4.1.3)
## ComplexHeatmap * 2.10.0 2021-10-26 [1] Bioconductor
## ConsensusClusterPlus 1.58.0 2021-10-26 [1] Bioconductor
## cowplot * 1.1.1 2020-12-30 [1] CRAN (R 4.1.3)
## crayon 1.5.1 2022-03-26 [1] CRAN (R 4.1.3)
## curl 4.3.2 2021-06-23 [1] CRAN (R 4.1.3)
## cytofkit * 0.99.0 2022-04-23 [1] Github (JinmiaoChenLab/cytofkit@3d13408)
## cytolib 2.6.2 2022-02-08 [1] Bioconductor
## CytoML 2.6.0 2021-10-26 [1] Bioconductor
## data.table * 1.14.2 2021-09-27 [1] CRAN (R 4.1.3)
## DBI 1.1.2 2021-12-20 [1] CRAN (R 4.1.3)
## dbplyr 2.1.1 2021-04-06 [1] CRAN (R 4.1.3)
## DelayedArray 0.20.0 2021-10-26 [1] Bioconductor
## DEoptimR 1.0-11 2022-04-03 [1] CRAN (R 4.1.3)
## desc 1.4.1 2022-03-06 [1] CRAN (R 4.1.3)
## destiny 3.8.1 2022-01-30 [1] Bioconductor
## devtools * 2.4.3 2021-11-30 [1] CRAN (R 4.1.3)
## digest 0.6.29 2021-12-01 [1] CRAN (R 4.1.3)
## doParallel 1.0.17 2022-02-07 [1] CRAN (R 4.1.3)
## dplyr * 1.0.8 2022-02-08 [1] CRAN (R 4.1.3)
## e1071 1.7-9 2021-09-16 [1] CRAN (R 4.1.3)
## ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.1.3)
## evaluate 0.15 2022-02-18 [1] CRAN (R 4.1.3)
## fansi 1.0.3 2022-03-24 [1] CRAN (R 4.1.3)
## farver 2.1.0 2021-02-28 [1] CRAN (R 4.1.3)
## fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.1.3)
## flowCore 2.6.0 2021-10-26 [1] Bioconductor
## FlowSOM 2.2.0 2021-10-26 [1] Bioconductor
## flowWorkspace 4.6.0 2021-10-26 [1] Bioconductor
## forcats * 0.5.1 2021-01-27 [1] CRAN (R 4.1.3)
## foreach 1.5.2 2022-02-02 [1] CRAN (R 4.1.3)
## fs 1.5.2 2021-12-08 [1] CRAN (R 4.1.3)
## generics 0.1.2 2022-01-31 [1] CRAN (R 4.1.3)
## GenomeInfoDb 1.30.1 2022-01-30 [1] Bioconductor
## GenomeInfoDbData 1.2.7 2022-04-23 [1] Bioconductor
## GenomicRanges 1.46.1 2021-11-18 [1] Bioconductor
## GetoptLong 1.0.5 2020-12-15 [1] CRAN (R 4.1.3)
## ggcyto 1.22.0 2021-10-26 [1] Bioconductor
## ggforce 0.3.3 2021-03-05 [1] CRAN (R 4.1.3)
## ggnewscale 0.4.7 2022-03-25 [1] CRAN (R 4.1.3)
## ggplot.multistats 1.0.0 2019-10-28 [1] CRAN (R 4.1.3)
## ggplot2 * 3.3.5 2021-06-25 [1] CRAN (R 4.1.3)
## ggpointdensity 0.1.0 2019-08-28 [1] CRAN (R 4.1.3)
## ggpubr 0.4.0 2020-06-27 [1] CRAN (R 4.1.3)
## ggrepel 0.9.1 2021-01-15 [1] CRAN (R 4.1.3)
## ggridges * 0.5.3 2021-01-08 [1] CRAN (R 4.1.3)
## ggsignif 0.6.3 2021-09-09 [1] CRAN (R 4.1.3)
## ggthemes 4.2.4 2021-01-20 [1] CRAN (R 4.1.3)
## GlobalOptions 0.1.2 2020-06-10 [1] CRAN (R 4.1.3)
## glue 1.6.2 2022-02-24 [1] CRAN (R 4.1.3)
## gplots 3.1.1 2020-11-28 [1] CRAN (R 4.1.3)
## graph 1.72.0 2021-10-26 [1] Bioconductor
## gridExtra 2.3 2017-09-09 [1] CRAN (R 4.1.3)
## gtable 0.3.0 2019-03-25 [1] CRAN (R 4.1.3)
## gtools 3.9.2 2021-06-06 [1] CRAN (R 4.1.3)
## haven 2.4.3 2021-08-04 [1] CRAN (R 4.1.3)
## hexbin 1.28.2 2021-01-08 [1] CRAN (R 4.1.3)
## highr 0.9 2021-04-16 [1] CRAN (R 4.1.3)
## hms 1.1.1 2021-09-26 [1] CRAN (R 4.1.3)
## htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.1.3)
## htmlwidgets 1.5.4 2021-09-08 [1] CRAN (R 4.1.3)
## httpuv 1.6.5 2022-01-05 [1] CRAN (R 4.1.3)
## httr 1.4.2 2020-07-20 [1] CRAN (R 4.1.3)
## igraph * 1.2.11 2022-01-04 [1] CRAN (R 4.1.3)
## IRanges 2.28.0 2021-10-26 [1] Bioconductor
## irlba 2.3.5 2021-12-06 [1] CRAN (R 4.1.3)
## isoband 0.2.5 2021-07-13 [1] CRAN (R 4.1.3)
## iterators 1.0.14 2022-02-05 [1] CRAN (R 4.1.3)
## jpeg 0.1-9 2021-07-24 [1] CRAN (R 4.1.1)
## jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.1.3)
## jsonlite 1.8.0 2022-02-22 [1] CRAN (R 4.1.3)
## KernSmooth 2.23-20 2021-05-03 [1] CRAN (R 4.1.3)
## knitr * 1.38 2022-03-25 [1] CRAN (R 4.1.3)
## labeling 0.4.2 2020-10-20 [1] CRAN (R 4.1.1)
## laeken 0.5.2 2021-10-06 [1] CRAN (R 4.1.3)
## later 1.3.0 2021-08-18 [1] CRAN (R 4.1.3)
## lattice 0.20-45 2021-09-22 [1] CRAN (R 4.1.3)
## latticeExtra 0.6-29 2019-12-19 [1] CRAN (R 4.1.3)
## lifecycle 1.0.1 2021-09-24 [1] CRAN (R 4.1.3)
## lmtest 0.9-40 2022-03-21 [1] CRAN (R 4.1.3)
## lubridate 1.8.0 2021-10-07 [1] CRAN (R 4.1.3)
## magrittr 2.0.2 2022-01-26 [1] CRAN (R 4.1.3)
## MASS 7.3-55 2022-01-16 [1] CRAN (R 4.1.3)
## Matrix * 1.4-0 2021-12-08 [1] CRAN (R 4.1.3)
## MatrixGenerics 1.6.0 2021-10-26 [1] Bioconductor
## matrixStats 0.61.0 2021-09-17 [1] CRAN (R 4.1.3)
## memoise 2.0.1 2021-11-26 [1] CRAN (R 4.1.3)
## mgcv 1.8-39 2022-02-24 [1] CRAN (R 4.1.3)
## mime 0.12 2021-09-28 [1] CRAN (R 4.1.1)
## miniUI 0.1.1.1 2018-05-18 [1] CRAN (R 4.1.3)
## modelr 0.1.8 2020-05-19 [1] CRAN (R 4.1.3)
## munsell 0.5.0 2018-06-12 [1] CRAN (R 4.1.3)
## ncdfFlow 2.40.0 2021-10-26 [1] Bioconductor
## needs * 0.0.3 2016-03-28 [1] CRAN (R 4.1.1)
## nlme 3.1-155 2022-01-16 [1] CRAN (R 4.1.3)
## nnet 7.3-17 2022-01-16 [1] CRAN (R 4.1.3)
## pacman * 0.5.1 2019-03-11 [1] CRAN (R 4.1.3)
## pcaMethods 1.86.0 2021-10-26 [1] Bioconductor
## pdist 1.2 2013-02-03 [1] CRAN (R 4.1.1)
## permute 0.9-7 2022-01-27 [1] CRAN (R 4.1.3)
## pheatmap * 1.0.12 2019-01-04 [1] CRAN (R 4.1.3)
## pillar 1.7.0 2022-02-01 [1] CRAN (R 4.1.3)
## pkgbuild 1.3.1 2021-12-20 [1] CRAN (R 4.1.3)
## pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.1.3)
## pkgload 1.2.4 2021-11-30 [1] CRAN (R 4.1.3)
## plyr * 1.8.7 2022-03-24 [1] CRAN (R 4.1.3)
## png 0.1-7 2013-12-03 [1] CRAN (R 4.1.1)
## polyclip 1.10-0 2019-03-14 [1] CRAN (R 4.1.1)
## prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.1.3)
## processx 3.5.3 2022-03-25 [1] CRAN (R 4.1.3)
## promises 1.2.0.1 2021-02-11 [1] CRAN (R 4.1.3)
## proxy 0.4-26 2021-06-07 [1] CRAN (R 4.1.3)
## ps 1.6.0 2021-02-28 [1] CRAN (R 4.1.3)
## purrr * 0.3.4 2020-04-17 [1] CRAN (R 4.1.3)
## R6 2.5.1 2021-08-19 [1] CRAN (R 4.1.3)
## ranger 0.13.1 2021-07-14 [1] CRAN (R 4.1.3)
## RANN 2.6.1 2019-01-08 [1] CRAN (R 4.1.3)
## RBGL 1.70.0 2021-10-26 [1] Bioconductor
## RColorBrewer 1.1-3 2022-04-03 [1] CRAN (R 4.1.3)
## Rcpp 1.0.8.3 2022-03-17 [1] CRAN (R 4.1.3)
## RcppAnnoy 0.0.19 2021-07-30 [1] CRAN (R 4.1.3)
## RcppEigen 0.3.3.9.2 2022-04-08 [1] CRAN (R 4.1.3)
## RcppHNSW 0.3.0 2020-09-06 [1] CRAN (R 4.1.3)
## D RcppParallel 5.1.5 2022-01-05 [1] CRAN (R 4.1.3)
## RCurl 1.98-1.6 2022-02-08 [1] CRAN (R 4.1.2)
## readr * 2.1.2 2022-01-30 [1] CRAN (R 4.1.3)
## readxl 1.3.1 2019-03-13 [1] CRAN (R 4.1.3)
## remotes 2.4.2 2021-11-30 [1] CRAN (R 4.1.3)
## reprex 2.0.1 2021-08-05 [1] CRAN (R 4.1.3)
## reshape2 1.4.4 2020-04-09 [1] CRAN (R 4.1.3)
## Rgraphviz 2.38.0 2021-10-26 [1] Bioconductor
## rjson 0.2.21 2022-01-09 [1] CRAN (R 4.1.2)
## rlang 1.0.2 2022-03-04 [1] CRAN (R 4.1.3)
## rmarkdown 2.13 2022-03-10 [1] CRAN (R 4.1.3)
## robustbase 0.95-0 2022-04-02 [1] CRAN (R 4.1.3)
## Rphenograph * 0.99.1 2022-03-30 [1] Github (JinmiaoChenLab/Rphenograph@0298487)
## rprojroot 2.0.3 2022-04-02 [1] CRAN (R 4.1.3)
## RProtoBufLib 2.6.0 2021-10-26 [1] Bioconductor
## RSpectra 0.16-0 2019-12-01 [1] CRAN (R 4.1.3)
## rstatix 0.7.0 2021-02-13 [1] CRAN (R 4.1.3)
## rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.1.3)
## Rtsne 0.16 2022-04-17 [1] CRAN (R 4.1.3)
## rvest 1.0.2 2021-10-16 [1] CRAN (R 4.1.3)
## S4Vectors 0.32.3 2021-11-21 [1] Bioconductor
## sass 0.4.1 2022-03-23 [1] CRAN (R 4.1.3)
## scales 1.2.0 2022-04-13 [1] CRAN (R 4.1.3)
## scattermore 0.8 2022-02-14 [1] CRAN (R 4.1.3)
## scatterplot3d 0.3-41 2018-03-14 [1] CRAN (R 4.1.1)
## sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.1.3)
## shape 1.4.6 2021-05-19 [1] CRAN (R 4.1.1)
## shiny 1.7.1 2021-10-02 [1] CRAN (R 4.1.3)
## shinyFiles 0.9.1 2021-11-10 [1] CRAN (R 4.1.3)
## SingleCellExperiment 1.16.0 2021-10-26 [1] Bioconductor
## smoother 1.1 2015-04-16 [1] CRAN (R 4.1.3)
## sp 1.4-7 2022-04-20 [1] CRAN (R 4.1.3)
## stringi 1.7.6 2021-11-29 [1] CRAN (R 4.1.2)
## stringr * 1.4.0 2019-02-10 [1] CRAN (R 4.1.3)
## SummarizedExperiment 1.24.0 2021-10-26 [1] Bioconductor
## testthat 3.1.2 2022-01-20 [1] CRAN (R 4.1.3)
## tibble * 3.1.6 2021-11-07 [1] CRAN (R 4.1.3)
## tidyr * 1.2.0 2022-02-01 [1] CRAN (R 4.1.3)
## tidyselect 1.1.2 2022-02-21 [1] CRAN (R 4.1.3)
## tidyverse * 1.3.1 2021-04-15 [1] CRAN (R 4.1.3)
## TTR 0.24.3 2021-12-12 [1] CRAN (R 4.1.3)
## tweenr 1.0.2 2021-03-23 [1] CRAN (R 4.1.3)
## tzdb 0.2.0 2021-10-27 [1] CRAN (R 4.1.3)
## usethis * 2.1.5 2021-12-09 [1] CRAN (R 4.1.3)
## utf8 1.2.2 2021-07-24 [1] CRAN (R 4.1.3)
## uwot * 0.1.11 2021-12-02 [1] CRAN (R 4.1.3)
## vcd 1.4-9 2021-10-18 [1] CRAN (R 4.1.3)
## vctrs 0.3.8 2021-04-29 [1] CRAN (R 4.1.3)
## vegan 2.6-2 2022-04-17 [1] CRAN (R 4.1.3)
## VGAM 1.1-6 2022-02-14 [1] CRAN (R 4.1.3)
## VIM 6.1.1 2021-07-22 [1] CRAN (R 4.1.3)
## withr 2.5.0 2022-03-03 [1] CRAN (R 4.1.3)
## xfun 0.30 2022-03-02 [1] CRAN (R 4.1.3)
## XML 3.99-0.9 2022-02-24 [1] CRAN (R 4.1.2)
## xml2 1.3.3 2021-11-30 [1] CRAN (R 4.1.3)
## xtable 1.8-4 2019-04-21 [1] CRAN (R 4.1.3)
## xts 0.12.1 2020-09-09 [1] CRAN (R 4.1.3)
## XVector 0.34.0 2021-10-26 [1] Bioconductor
## yaml 2.3.5 2022-02-21 [1] CRAN (R 4.1.2)
## zlibbioc 1.40.0 2021-10-26 [1] Bioconductor
## zoo 1.8-10 2022-04-15 [1] CRAN (R 4.1.3)
##
## [1] D:/Programme/R/R-4.1.3/library
##
## D -- DLL MD5 mismatch, broken installation.
##
## ------------------------------------------------------------------------------